Experimental Assessment of Parallel Systems
Authors
Abstract
In the research reported in this paper, transient faults were injected, by means of software fault injection, into the nodes and the communication subsystem of a commercial parallel machine running several real applications. The results showed that a significant percentage of faults caused the system to produce wrong results while the application seemed to terminate normally. This demonstrates that fault tolerance techniques are required in parallel systems not only to ensure that long-running applications can terminate but also, and more importantly, to ensure that the results produced are correct. Of the techniques tested to reduce the percentage of undetected wrong results, only algorithm-based fault tolerance (ABFT) proved to be effective. For other simple error detection methods to be effective, they have to be designed in, not added as an afterthought. Fault injection in the communication subsystem demonstrated the effectiveness of end-to-end CRCs on the data movements between processors.
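As an illustration of the ABFT idea mentioned in the abstract, the following is a minimal sketch of the classic checksum scheme for matrix multiplication. It is not the implementation evaluated in the paper, which this page does not describe. It assumes small integer matrices stored as lists of lists, and the helper names (`add_checksums`, `checksums_ok`, `flip_bit`) are hypothetical. A single bit flip in the product stands in for a transient fault that would otherwise yield a silently wrong result.

```python
def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def add_checksums(a, b):
    """Append a column-checksum row to a and a row-checksum column to b."""
    a_c = a + [[sum(col) for col in zip(*a)]]
    b_r = [row + [sum(row)] for row in b]
    return a_c, b_r


def checksums_ok(c_full):
    """Check that the last row and column equal the sums of the data part."""
    n_rows = len(c_full) - 1
    n_cols = len(c_full[0]) - 1
    rows_ok = all(sum(row[:n_cols]) == row[n_cols]
                  for row in c_full[:n_rows])
    cols_ok = all(sum(c_full[i][j] for i in range(n_rows)) == c_full[n_rows][j]
                  for j in range(n_cols))
    return rows_ok and cols_ok


def flip_bit(c_full, i, j, bit):
    """Hypothetical fault injector: flip one bit of one element in place."""
    c_full[i][j] ^= 1 << bit


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
a_c, b_r = add_checksums(a, b)
c_full = matmul(a_c, b_r)          # product carries its own checksums
fault_free = checksums_ok(c_full)  # no fault injected yet

flip_bit(c_full, 0, 1, 3)          # simulate a transient fault in one element
after_fault = checksums_ok(c_full)
```

In this sketch the check passes on the fault-free product and fails once the bit is flipped, so the application can report an error instead of terminating normally with a wrong answer. With floating-point data the equality tests would need a tolerance, which is an assumption left out here for brevity.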
Similar resources
Parallel computation framework for optimizing trailer routes in bulk transportation
We consider a rich tanker trailer routing problem with stochastic transit times for chemicals and liquid bulk orders. A typical route of the tanker trailer comprises sourcing a cleaned and prepped trailer from a pre-wash location, picking up and delivering chemical orders, cleaning the tanker trailer at a post-wash location after order delivery, and prepping for the next order. Unlike traditiona...
Ageing Orders of Series-Parallel and Parallel-Series Systems with Independent Subsystems Consisting of Dependent Components
In this paper, we consider series-parallel and parallel-series systems with independent subsystems consisting of dependent homogeneous components whose joint lifetimes are modeled by an Archimedean copula. Then, by considering two such systems with different numbers of components within each subsystem, we establish hazard rate and reversed hazard rate orderings between the two system lifetimes,...
A Hybrid Unconscious Search Algorithm for Mixed-model Assembly Line Balancing Problem with SDST, Parallel Workstation and Learning Effect
Due to the variety of products, simultaneous production of different models has an important role in production systems. Moreover, considering realistic constraints in designing production lines has attracted a lot of attention in recent research. Since the assembly line balancing problem is NP-hard, efficient methods are needed to solve this kind of problem. In this study, a new hybrid met...
Preservation of Stochastic Orderings of Interdependent Series and Parallel Systems by Componentwise Switching to Exponentiated Models
This paper discusses the preservation of some stochastic orders between two interdependent series and parallel systems when the survival and distribution functions of all components switch to the exponentiated model. For the series systems, the likelihood ratio, hazard rate, usual, aging faster, aging intensity, convex transform, star, superadditive and dispersive orderings, and for the paralle...
Stochastic Comparisons of Series and Parallel Systems with Heterogeneous Extended Generalized Exponential Components
In this paper, we discuss the usual stochastic, likelihood ratio, dispersive and convex transform orders between two parallel systems with independent heterogeneous extended generalized exponential components. We also establish the usual stochastic order between series systems from two independent heterogeneous extended generalized exponential samples. Finally, we f...
A Multi Objective Optimization Model for Redundancy Allocation Problems in Series-Parallel Systems with Repairable Components
The main goal of this paper is to propose an optimization model for determining the structure of a series-parallel system. Relative to previous studies of series-parallel systems, the main contribution of this study is to extend redundancy allocation to systems that have repairable components. The considered optimization model has two objectives: maximizing the system mean time t...